Moonshine: Distilling with Cheap Convolutions
Authors
Abstract
Model distillation compresses a trained machine learning model, such as a neural network, into a smaller alternative so that it can be easily deployed in a resource-limited setting. Unfortunately, this requires engineering two architectures: a teacher architecture and a smaller student architecture trained to emulate it. In this paper, we present a distillation strategy that produces a student architecture that is a simple transformation of the teacher architecture. Recent model distillation methods allow us to preserve most of the performance of the trained model after replacing convolutional blocks with a cheap alternative. In addition, distillation by attention transfer provides student network performance that is better than training that student architecture directly on data.
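To make "cheap alternative" concrete, the sketch below compares weight counts for a standard convolution against two common cheaper substitutes. The abstract does not say which substitutes the paper uses, so the grouped and depthwise separable variants, the helper names `conv_params` and `separable_params`, and the 64-channel, 3 x 3 layer are all assumptions chosen for illustration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution, ignoring bias terms."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 kernel.
standard = conv_params(64, 64, 3)
grouped = conv_params(64, 64, 3, groups=4)
separable = separable_params(64, 64, 3)

print(standard, grouped, separable)  # 36864 9216 4672
```

In this sketch the student keeps the teacher's layer layout and only the per-block weight count shrinks, which is one way to read "a simple transformation of the teacher architecture".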
Related resources
Parallel Computation of a Sequence of Convolutions
Parallel programming is a good way to increase computational speed. Any network of workstations can be used as a relatively cheap parallel computer. In this paper we show two methods for parallel computation of sequences of convolutions: pipelining and data partitioning. We will show the algorithms for these methods, performance analysis of them and results of some experime...
Monstrous and Generalized Moonshine and Permutation Orbifolds
We consider the application of permutation orbifold constructions towards a new possible understanding of the genus zero property in Monstrous and Generalized Moonshine. We describe a theory of twisted Hecke operators in this setting and conjecture on the form of Generalized Moonshine replication formulas.
Proof of the umbral moonshine conjecture
The Umbral Moonshine Conjectures assert that there are infinite-dimensional graded modules, for prescribed finite groups, whose McKay–Thompson series are certain distinguished mock modular forms. Gannon has proved this for the special case involving the largest sporadic simple Mathieu group. Here, we establish the existence of the umbral moonshine modules in the remaining 22 cases. Mathematics ...
Decomposition of the Moonshine Module with respect to a code over $\mathbb{Z}_{2k}$
In this paper, we give a decomposition of the moonshine module $V^\natural$ with respect to an extremal Type II code over $\mathbb{Z}_{2k}$ for an integer $k \ge 2$. Then we obtain automorphisms of $V^\natural$, some 4A and 2B elements of the Monster with respect to the decomposition. We give examples of such a decomposition for some $k$ and give the McKay-Thompson series for a 4A element.
Monstrous Moonshine and the Uniqueness of the Moonshine Module (DIAS-STP-92-29, 16 Nov 1992)
In this talk we consider the relationship between the conjectured uniqueness of the Moonshine module V of Frenkel, Lepowsky and Meurman and Monstrous Moonshine, the genus zero property for Thompson series discovered by Conway and Norton. We discuss some evidence to support the uniqueness of V by considering possible alternative orbifold constructions of V from a Leech lattice compactified strin...
Journal title:
- CoRR
Volume abs/1711.02613
Publication date 2017